Entry Name:  "VT-Wang-MC1"

VAST Challenge 2015
Mini-Challenge 1

 

 

Team Members:

Junpeng Wang, Virginia Tech, junpeng@vt.edu    PRIMARY
Ji Wang, Virginia Tech, wji@vt.edu

Chris North, Virginia Tech, north@vt.edu 

 

Student Team:  YES

 

Did you use data from both mini-challenges?  YES

 

Analytic Tools Used:

MoveView, developed by the team for the challenge.

Spectrum, developed by the team for the challenge.

Tableau

Gephi

 

Approximately how many hours were spent working on this submission in total?

200 hours.

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? Yes

 

Video Download

Video:

http://people.cs.vt.edu/~junpeng/html/video/VT-Wang-MC1.wmv

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

MC1.1 – Characterize the attendance at DinoFun World on this weekend. Describe up to twelve different types of groups at the park on this weekend.

a.       How big is this type of group?

b.       Where does this type of group like to go in the park?

c.       How common is this type of group?

d.       What are your other observations about this type of group?

e.       What can you infer about this type of group?

f.        If you were to make one improvement to the park to better meet this group’s needs, what would it be?

Limit your response to no more than 12 images and 1000 words.

Group 1: 

a.       This group had around 8 people.

b.       They went to the Stage Show (Location 63 on the map) twice a day (8:45-12:15 and 13:45-17:15) on Friday and Saturday, but only once (8:45-12:15) on Sunday.

c.       We found only one such group.

d.       They always followed the same path from the park gate to Location 63. They stayed only at Location 63 (2 hours/visit) and were not interested in other attractions.

e.       People in this group were probably bodyguards or staff working closely with the soccer star.

f.        Suggestion: (1) Secure the path they used from the park gate to Location 63 between 8:45-9:30, 11:30-12:15, 13:45-14:30 and 16:30-17:15; (2) arrange some stage shows at other locations after 11:00 and 16:00, so that Location 63 is less crowded. The following figure (from the visualization tool MoveView) shows the IDs in this group and the path the group used.

[Figure: MoveView view of the IDs in Group 1 and the path they used]

Group 2: We used K-means and community detection algorithms to cluster groups. Factors considered: locations visited, time spent at each location, and the order in which locations were visited.

a.       These groups had around 4-8 people.

b.       They came in the early morning and were very interested in the Thrill Rides in the Wet Land and Tundra Land areas. Unlike most other visitors, who visited the Shopping area in the evening, these groups were not interested in shopping.

c.       We found around 20 such groups on Friday, around 45 on Saturday, and around 70 on Sunday.

d.       Group members split into 2-3 subgroups. Different subgroups used different paths. Group members communicated intensively via their devices.

e.       Group members may have had different ride preferences and therefore split into subgroups. Most of the members were probably young visitors, as they were interested in dangerous rides. The following figure shows how we found these groups in Spectrum.

[Figure: Spectrum view used to find these groups]
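As a rough illustration of the K-means step mentioned for Group 2, the sketch below clusters visitors by the time they spent at two kinds of locations. It is not the team's implementation: the feature choice (minutes at thrill rides and at shopping locations), the visitor labels A-D, the numbers, and the `kmeans` helper are all assumptions made for this example. The order of visits and the community detection step are left out.

```python
from math import dist

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means. The first k points seed the centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical per-visitor features: minutes spent at (thrill rides, shopping).
features = {
    "A": (300.0, 5.0),
    "B": (20.0, 180.0),
    "C": (280.0, 0.0),
    "D": (10.0, 200.0),
}
ids = list(features)
labels, _ = kmeans([features[i] for i in ids], k=2)
clusters = dict(zip(ids, labels))
```

With these made-up numbers, visitors A and C (mostly thrill rides) end up in one cluster and B and D (mostly shopping) in the other. On real data the features would need scaling, and the choice of k would have to be justified.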

Group 3:

a.       These groups had around 40 people.

b.       They usually came around 9:30 and were interested in Thrill Rides, Shows and Beer.

c.       We found around 4 such groups on Friday; around 12 groups on Saturday; around 15 groups on Sunday.

d.       Members in these groups followed the same path. Members only communicated with others inside the same group.

e.       Members of the same group may have belonged to the same organization, or their visit may have been arranged by an organization, as the groups were very large.

f.        Beer Garden locations need to be big enough to hold an entire group of this size. The following figure shows how Spectrum distinguishes these groups.

[Figure: Spectrum view distinguishing these groups]

Group 4:

a.       Around 2 people.

b.       These groups were interested in Shopping areas. They spent most of their evenings in the Shopping areas.

c.       These groups were very common throughout the three-day weekend.

d.       Group members followed the same path in the park and communicated a lot with each other.

e.       We believe these groups were couples or small families. The following figure shows the results from the visualization tools Spectrum and Gephi.

[Figure: results from Spectrum and Gephi]

Group 5:

a.       Around 4-8 people.

b.       They were interested in many types of locations, but they did not stay at any location for long.

c.       We found around 15 such groups on Friday, around 35 on Saturday, and around 45 on Sunday.

d.       They usually left the park around 13:00 and came back around 16:00. Several of these groups rarely went to Food or Beer locations.

e.       Members of these groups may not have been interested in the food in the park and probably preferred food outside the park.

f.        Advertise the food in the park more and provide more diverse types of food. The following figure shows how the filters in Spectrum helped us find these groups.

[Figure: Spectrum filters used to find these groups]

Group 6:

a.       Group sizes ranged from 2 to 8.

b.       These groups spent a lot of time in Kiddie Land and Wet Land areas. They were interested in Shopping, Kiddie Rides, Food and Show. They never went to Beer Garden locations.

c.       These groups were very common across the three days.

d.       They went to shopping locations in the evening and did not stay in the park very late.

e.       These groups probably included children. The following figure shows the Spectrum visualization results.

 

[Figure: Spectrum visualization results for these groups]

 

Group 7:

a.       Usually not very big. The group sizes we found were 1, 2, 4, 5 and 7.

b.       These groups stayed at one location for a very long time (around 5 hours or more). Locations included Food, Beer, Show, slow Rides and Rest areas.

c.       We found 3 groups on Friday and 2 groups on Sunday, as shown in the following figures.

[Figures: the groups found on Friday and Sunday]

d.       One group came very late on Friday (group 1 on Friday). This group was interested in the Food and Beer locations in the park.

e.       Members of these groups may have been older and needed more time to rest. These groups may also be suspects in the incident, as they stayed at one location for a very long time.

f.        The park service can send messages to these groups, asking if they need any help. These groups, especially group 2 on Sunday (the following figure gives the filtered results of Spectrum), deserve more attention.

[Figure: filtered Spectrum results for group 2 on Sunday]
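The long stays described for Group 7 can be looked for as large time gaps between consecutive movement records of the same ID. The sketch below is only an illustration under assumptions: it takes each record to have the form `timestamp,id,type,x,y`, as in the records quoted elsewhere in this submission, and the helper names `parse_ts` and `long_stays` are made up for the example. The threshold is lowered to 3 hours so that the three sample records, all quoted in this submission, produce a hit.

```python
from datetime import datetime

def parse_ts(text):
    # Timestamps look like "2014-6-06 09:51:08" (the month is not zero-padded).
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

def long_stays(lines, min_hours=5.0):
    """Find gaps of at least min_hours between consecutive records of one ID."""
    rows = [line.strip().split(",") for line in lines]
    rows.sort(key=lambda r: parse_ts(r[0]))
    last = {}   # id -> (timestamp, x, y) of that ID's previous record
    stays = []
    for ts_text, vid, _kind, x, y in rows:
        ts = parse_ts(ts_text)
        if vid in last:
            prev_ts, px, py = last[vid]
            hours = (ts - prev_ts).total_seconds() / 3600
            if hours >= min_hours:
                # Report the position where the ID was waiting.
                stays.append((vid, px, py, hours))
        last[vid] = (ts, x, y)
    return stays

records = [
    "2014-6-06 09:51:08,629048,check-in,76,22",
    "2014-6-06 13:00:01,629048,check-in,76,22",
    "2014-6-06 21:53:11,657863,check-in,6,43",
]
result = long_stays(records, min_hours=3.0)
```

Here `result` holds one entry, for ID 629048 at coordinates (76, 22). This sketch only sees gaps between two records, so an ID whose last record of the day starts the long stay would need an extra check against the park's closing time.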

 

 

 

MC1.2 – Are there notable differences in the patterns of activity in the park across the three days?  Please describe the notable differences you see.

 

Limit your response to no more than 3 images and 300 words.

1.       Group 1 in MC1.1 did not come for the second Stage Show at location 63 on Sunday afternoon. The following figure shows this group’s movement on Friday, Saturday and Sunday in Spectrum.

[Figure: Group 1's movement on Friday, Saturday and Sunday in Spectrum]

2.       A larger percentage of visitors stayed in the Kiddie Land (yellow) area on Friday than on Saturday and Sunday. A larger percentage of visitors took the Thrill Rides (red) on Saturday and Sunday than on Friday. The following picture shows the high-resolution images rendered by Spectrum. We put these images on four 4K displays to navigate and explore them.

[Figure: high-resolution images rendered by Spectrum]

 

3.       There were more visitors on Sunday than on Saturday, and more on Saturday than on Friday. The numbers of visitors were: Friday 3556; Saturday 6409; Sunday 7568.
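Per-day visitor counts like the ones above can be obtained by counting distinct IDs in the movement records of each day. The following is a minimal sketch, assuming each record has the form `timestamp,id,type,x,y` as in the records quoted in this submission. The function name `visitors_per_day` is made up for the example, and the four sample lines are not the full data set, so the result does not reproduce the totals above.

```python
from collections import defaultdict

def visitors_per_day(lines):
    """Count distinct visitor IDs per calendar day."""
    ids_by_day = defaultdict(set)
    for line in lines:
        timestamp, visitor_id, *_ = line.strip().split(",")
        day = timestamp.split(" ")[0]       # e.g. "2014-6-06"
        ids_by_day[day].add(visitor_id)     # a set ignores repeated IDs
    return {day: len(ids) for day, ids in ids_by_day.items()}

sample = [
    "2014-6-06 09:51:08,629048,check-in,76,22",
    "2014-6-06 13:00:01,629048,check-in,76,22",
    "2014-6-06 21:53:11,657863,check-in,6,43",
    "2014-6-08 10:19:59,898576,check-in,43,56",
]
counts = visitors_per_day(sample)
```

Because IDs are collected in a set, an ID with many records on the same day is counted once for that day.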

 

 

 

MC1.3 – What anomalies or unusual patterns do you see? Describe no more than 10 anomalies, and prioritize those unusual patterns that you think are most likely to be relevant to the crime.

 

Limit your response to no more than 10 images and 500 words.

 

1.       On Sunday, we found 7 IDs that stayed in a restroom for more than 5 hours (see the answer to MC1.1, Group 7). These IDs are: 227221, 392618, 1095309, 1336607, 1483705, 1722376 and 2063022. No messages were sent from or received by these IDs after 15:03. The following figure from Tableau supports this.

[Figure: Tableau view of the messages of these IDs]

 

2.       The soccer star did not come for the second Stage Show at location 63 on Sunday (we found this by tracing the 8 IDs of Group 1 in MC1.1 in MoveView). The first difference in MC1.2 provides more details on this point.

 

3.       On Sunday, around 12:00, messages sent to “External” reached a peak. Most of the messages were sent from the Wet Land area. The following figure from Tableau shows the peak.

[Figure: Tableau view of the peak in messages sent to “External”]

 

4.       On Sunday, around 12:05, messages sent to ID 839736 (we believe this ID is a Park Service ID) reached a peak. Most of these messages were sent from the Wet Land area (the location was 32; we found this in MoveView).

 

5.       On Sunday, around 14:50, messages sent to ID 839736 reached another peak. Most of these messages were sent from Coaster Alley (location 63). People may have been asking why the Stage Show (at location 63) was canceled. The following figure from Tableau shows the peaks.

[Figure: Tableau view of the peaks in messages sent to ID 839736]

 

6.       On Sunday, between 12:00-12:30 and 14:35-15:05, ID 839736 (in Entry Corridor, most probably location 62) replied to a lot of messages, as shown in the following picture from Tableau.

[Figure: Tableau view of the replies from ID 839736]

 

7.       The last movement record of ID 898576 on Sunday is “2014-6-08 10:19:59,898576,check-in,43,56”. He/she stayed at location 24 (Kauf’s Lost Canyon Escape) from 10:20 to the end of the day. Although the ID did not move after 10:20, he/she still communicated with others. The last message this ID received is “2014-6-08 22:03:27,1860467,898576,Tundra Land”; the last message this ID sent is “2014-6-08 21:27:24,898576,972256,Wet Land”. This ID communicated with 160 unique IDs (including “External”). The following figure shows this ID in Spectrum.

[Figure: ID 898576 in Spectrum]

 

8.       On Friday, ID 657863 was always moving with a group of five people. The group checked in at location 20 around 21:53. Four other people in the group left around 22:20, but he/she did not leave. The last record of this ID is “2014-6-06 21:53:11,657863,check-in,6,43”. This visitor did not receive any messages after 21:25:39. The following two figures show this group in Spectrum and MoveView.

[Figures: the group of ID 657863 in Spectrum and MoveView]

 

9.       On Friday, one group of four people did not move after the first Stage Show at Location 63. They stayed there and attended the second show in the afternoon. Two consecutive records of an ID in this group look like:

“2014-6-06 09:51:08,629048,check-in,76,22”

“2014-6-06 13:00:01,629048,check-in,76,22”

[Figure]

 

10.   On Friday, Saturday and Sunday, between 12:00-12:55, 14:00-14:55, 16:00-16:55, 18:00-18:55 and 20:00-20:55, ID 1278894 broadcast messages to a significant number of visitors every 5 minutes. The following figure from Tableau supports this.

[Figure: Tableau view of the broadcasts from ID 1278894]

The following figure shows one of the broadcasts from ID 1278894 in MoveView. This figure also supports the 6th anomaly, i.e., ID 839736 sent a lot of replies to visitors at Location 32 around 12:05.

[Figure: one broadcast from ID 1278894 in MoveView]
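The message peaks reported in the anomalies above were read from Tableau charts. A comparable count can be sketched in code by binning communication records into 5-minute intervals. This is an illustration under assumptions: each record is taken to have the form `timestamp,from,to,location`, as in the messages quoted in anomaly 7, location names are assumed to contain no commas, and the helper names and the sender IDs in the sample are made up.

```python
from collections import Counter
from datetime import datetime

def messages_per_bin(lines, recipient, bin_minutes=5):
    """Count messages sent to `recipient` in each bin_minutes-wide time bin."""
    counts = Counter()
    for line in lines:
        ts_text, _sender, to, _location = line.strip().split(",")
        if to != recipient:
            continue
        ts = datetime.strptime(ts_text, "%Y-%m-%d %H:%M:%S")
        # Round the timestamp down to the start of its bin.
        start_minute = ts.minute - ts.minute % bin_minutes
        counts[ts.replace(minute=start_minute, second=0)] += 1
    return counts

def peak_bin(counts):
    """Return (bin start, message count) for the busiest bin."""
    return max(counts.items(), key=lambda kv: kv[1])

# Hypothetical records; only the recipient ID 839736 comes from the text.
sample = [
    "2014-6-08 12:04:10,111,839736,Wet Land",
    "2014-6-08 12:06:30,222,839736,Wet Land",
    "2014-6-08 12:07:45,333,839736,Wet Land",
    "2014-6-08 14:51:00,444,839736,Coaster Alley",
    "2014-6-08 12:06:00,555,External,Wet Land",
]
peak = peak_bin(messages_per_bin(sample, "839736"))
```

The same binning, applied to messages sent by one ID instead of received by it, would be one way to check for a regular 5-minute broadcast rhythm like the one described in anomaly 10.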